Ling Feng, Elizabeth Chang, Tharam Dillon



October 2002 ACM Transactions on Information Systems (TOIS), Volume 20 issue 4 
Full text available: pdf(285.64 KB) Additional Information: full citation, abstract, references, citings, index terms



The Extensible Markup Language (XML) is fast emerging as the dominant standard for
describing and interchanging data among various systems and databases on the Internet. It 
offers the Document Type Definition (DTD) as a formalism for defining the syntax and 
structure of XML documents. The XML Schema definition language, as a replacement for the
DTD, provides richer facilities for defining and constraining the content of XML
documents. However, it does not concentrate on the semantics that und ... 



Keywords: XML, XML Schema, conceptual modeling, design methodology, semantic 
network 



2 Fast detection of communication patterns in distributed executions
Thomas Kunz, Michiel F. H. Seuren 

November 1997 Proceedings of the 1997 conference of the Centre for Advanced Studies 
on Collaborative research 

Full text available: pdf(4.21 MB) Additional Information: full citation, abstract, references, index terms

Understanding distributed applications is a tedious and difficult task. Visualizations based on 
process-time diagrams are often used to obtain a better understanding of the execution of 
the application. The visualization tool we use is Poet, an event tracer developed at the 
University of Waterloo. However, these diagrams are often very complex and do not provide 
the user with the desired overview of the application. In our experience, such tools display 
repeated occurrences of non-trivial commun ... 

3 Proximal nodes: a model to query document databases by content and structure 
Gonzalo Navarro, Ricardo Baeza-Yates 

October 1997 ACM Transactions on Information Systems (TOIS), Volume 15 Issue 4 

Full text available: pdf(550.43 KB) Additional Information: full citation, abstract, references, citings, index terms, review

A model to query document databases by both their content and structure is presented. The
goal is to obtain a query language that is expressive in practice while being efficiently
implementable, features not present at the same time in previous work. The key ideas of 
the model are a set-oriented query language based on operations on nearby structure 
elements of one or more hierarchies, together with content and structural indexing and 
bottom-up evaluation. The model is evaluated in regard t ... 

Keywords: expressivity and efficiency of query languages, hierarchical documents, 
structured text, text algebras 



4 Special issue: AI in engineering
D. Sriram, R. Joobbani 

January 1985 ACM SIGART Bulletin, issue 91 

Full text available: pdf(8.79 MB) Additional Information: full citation, abstract

The papers in this special issue were compiled from responses to the announcement in the 
July 1984 issue of the SIGART newsletter and notices posted over the ARPAnet. The interest 
being shown in this area is reflected in the sixty papers received from over six countries. 
About half the papers were received over the computer network. 




5 Knowledge management session 4: indexing: Bootstrapping for hierarchical document
classification

Giordano Adami, Paolo Avesani, Diego Sona 

November 2003 Proceedings of the twelfth international conference on Information and 
knowledge management 

Full text available: pdf(180.73 KB) Additional Information: full citation, abstract, references, index terms

Managing the hierarchical organization of data is starting to play a key role in the knowledge 
management community due to the great amount of human resources needed to create and 
maintain these organized repositories of information. The machine learning community has in
part addressed this problem by developing hierarchical supervised classifiers that help 
maintainers to categorize new resources within given hierarchies. Although such learning 
models succeed in exploiting relational knowledge, they ... 

Keywords: TaxSOM, constrained clustering, k-means, taxonomy bootstrapping process, 
text categorization 



6 Multimedia data indexing: Looking at mapping, indexing & querying of MPEG-7 
descriptors in RDBMS with SM3 
Yang Chu, Liang-Tien Chia, Sourav S. Bhowmick 

November 2004 Proceedings of the 2nd ACM international workshop on Multimedia 
databases 

Full text available: pdf(279.92 KB) Additional Information: full citation, abstract, references, index terms

MPEG-7 documents, which are primarily for multimedia information exchange, are also
data-centric XML documents. Due to its advantages, the relational DBMS is the best choice for
storing such XML documents. Storing XML data in relational DBMS can be classified into two
classes of storage model: structure-mapping and model-mapping. However, the
structure-mapping model cannot support complex XPath-based queries efficiently and the
model-mapping approach lacks the flexible capability in representing al ...

Keywords: MPEG-7, SM3, relational DBMS, storing XML documents 
David Durand, Paul Kahn

May 1998 Proceedings of the ninth ACM conference on Hypertext and hypermedia : 
links, objects, time and space — structure in hypermedia systems: links, 
objects, time and space — structure in hypermedia systems 

Full text available: pdf(1.52 MB) Additional Information: full citation, references, citings, index terms



8 Document management: Context representation, transformation and comparison for ad
hoc product data exchange
Jingzhi Guo, Chengzheng Sun 

November 2003 Proceedings of the 2003 ACM symposium on Document engineering 

Full text available: pdf(275.65 KB) Additional Information: full citation, abstract, references, index terms

Product data exchange is the precondition of business interoperation between Web-based 
firms. However, millions of small and medium sized enterprises (SMEs) encode their Web 
product data in ad hoc formats for electronic product catalogues. This prevents product data 
exchange between business partners for business interoperation. To solve this problem, this 
paper has proposed a novel concept-centric catalogue engineering approach for 
representing, transforming and comparing semantic contexts in a ... 

Keywords: XML product map, XPM, ad hoc product data exchange, concept, context 
comparison, context representation, context transformation, electronic commerce, electronic 
product catalogue, product data integration, semantics 



9 Modeling the storage architectures of commercial database systems
D. S. Batory 

December 1985 ACM Transactions on Database Systems (TODS), volume 10 issue 4 

Full text available: pdf(4.46 MB) Additional Information: full citation, abstract, references, citings, index terms, review

Modeling the storage structures of a DBMS is a prerequisite to understanding and optimizing 
database performance. Previously, such modeling was very difficult because the 
fundamental role of conceptual-to-internal mappings in DBMS implementations went 
unrecognized. In this paper we present a model of physical databases, called the 
transformation model, that makes conceptual-to-internal mappings explicit. By exposing 
such mappings, we show that it is possible to model the storage ... 

10 XIRQL: An XML query language based on information retrieval concepts
Norbert Fuhr, Kai Großjohann

April 2004 ACM Transactions on Information Systems (TOIS), Volume 22 issue 2 

Full text available: pdf(281.91 KB) Additional Information: full citation, abstract, references, citings, index terms

XIRQL ("circle") is an XML query language that incorporates imprecision and vagueness for 
both structural and content-oriented query conditions. The corresponding uncertainty is 
handled by a consistent probabilistic model. The core features of XIRQL are (1) document 
ranking based on index term weighting, (2) specificity-oriented search for retrieving the 
most relevant parts of documents, (3) datatypes with vague predicates for dealing with 
specific types of content and (4) structural vagueness f ... 

Keywords: Path algebra, XML, XQuery, probabilistic retrieval, ranked retrieval, vague 
predicates 
11 An automated approach for retrieving hierarchical data from HTML tables
Seung-Jin Lim, Yiu-Kai Ng 

November 1999 Proceedings of the eighth international conference on Information and 
knowledge management 

Full text available: pdf(1.74 MB) Additional Information: full citation, abstract, references, citings, index terms

Among the HTML elements, HTML tables [RHJ98] encapsulate hierarchically structured data 
(hierarchical data in short) in a tabular structure. HTML tables do not come with a rigid
schema and almost any form of two-dimensional table is acceptable according to the
HTML grammar. This relaxation complicates the process of retrieving hierarchical data from 
HTML tables. In this paper, we propose an automated approach for retrieving hierarchical 
data from HTML tables. The proposed approach constr ... 

12 Special issue on knowledge representation
Ronald J. Brachman, Brian C. Smith 
February 1980 ACM SIGART Bulletin, issue 70 

Full text available: pdf(13.13 MB) Additional Information: full citation, abstract

In the fall of 1978 we decided to produce a special issue of the SIGART Newsletter devoted 
to a survey of current knowledge representation research. We felt that there were two
useful functions such an issue could serve. First, we hoped to elicit a clear picture of how 
people working in this subdiscipline understand knowledge representation research, to 
illuminate the issues on which current research is focused, and to catalogue what 
approaches and techniques are currently being developed. Secon ... 

13 Web clustering and usage mining: Clustering documents in a web directory
Giordano Adami, Paolo Avesani, Diego Sona 

November 2003 Proceedings of the 5th ACM international workshop on Web 
information and data management 

Full text available: pdf(180.53 KB) Additional Information: full citation, abstract, references, index terms

Hierarchical categorization of documents is a task receiving growing interest due to the 
widespread proliferation of topic hierarchies for text documents. The worst problem of 
hierarchical supervised classifiers is their high demand in terms of labeled examples, whose 
amount is related to the number of topics in the taxonomy. Hence, bootstrapping a huge 
hierarchy with a proper set of labeled examples is a critical issue. In this paper, we propose 
some solutions for the bootstrapping problem, imp ... 

Keywords: TaxSOM, constrained clustering, digital libraries, k-means, knowledge 
management, taxonomy bootstrapping process, text categorization, web directories 



14 Automating XML documents transformations: a conceptual modelling based approach
A. Boukottaya, C. Vanoirbeek, F. Paganelli, O. Abou Khaled 

January 2004 Proceedings of the first Asian-Pacific conference on Conceptual 
modelling - Volume 31 CRPIT '04 

Full text available: pdf(366.94 KB) Additional Information: full citation , abstract , references 

The growing use of XML mark-up language has made a large amount of heterogeneous XML 
documents widely available. As the number of applications that utilize heterogeneous XML 
documents grows, the importance of XML documents transformations increases greatly. A 
serious obstacle for translating directly between two XML documents, using languages like 
XSLT, is that a mapping between the two XML representations needs to be carefully 
specified by a human expert. Current research attempts to address th ... 

Keywords: Layered Interoperability Model for XML Schemas, automating XML documents 
15 Checking the temporal integrity of interactive multimedia documents
I. Mirbel, B. Pernici, T. Sellis, S. Tserkezoglou, M. Vazirgiannis 

July 2000 The VLDB Journal — The International Journal on Very Large Data Bases, 

Volume 9 Issue 2 

Full text available: pdf(269.63 KB) Additional Information: full citation, abstract, citings, index terms

When authoring multimedia scenarios, and in particular scenarios with user interaction, 
where the sequence and time of occurrence of interactions are not predefined, it is difficult to
guarantee the consistency of the resulting scenarios. As a consequence, the execution of the 
scenario may result in unexpected behavior or inconsistent use of media. The present paper 
proposes a methodology for checking the temporal integrity of interactive multimedia 
document (IMD) scenarios at authoring ti ... 

Keywords: Constraint networks, Multimedia presentation, Temporal integrity 




16 Petri-net-based hypertext: document structure with browsing semantics
P. David Stotts, Richard Furuta 

January 1989 ACM Transactions on Information Systems (TOIS), volume 7 issue 1

Full text available: pdf(2.19 MB) Additional Information: full citation, abstract, references, citings, index terms, review

We present a formal definition of the Trellis model of hypertext and describe an authoring 
and browsing prototype called αTrellis that is based on the model. The Trellis model not
only represents the relationships that tie individual pieces of information together into a 
document (i.e., the adjacencies), but specifies the browsing semantics to be associated with 
the hypertext as well (i.e., the manner in which the information is to be visited and 
presented). The model is based on Petri ... 

17 XAS: a system for accessing componentized, virtual XML documents
Ming-Ling Lo, Shyh-Kwei Chen, Sriram Padmanabhan, Jen-Yao Chung 

July 2001 Proceedings of the 23rd International Conference on Software Engineering 

Full text available: pdf, Publisher Site Additional Information: full citation, abstract, references, citings, index terms

XML is emerging as an important format for describing the schema of documents and data 
to facilitate integration of applications in a variety of industry domains. An important issue 
that naturally arises is the requirement to generate, store and access XML documents. 

It is important to reuse existing data management systems and repositories for this 
purpose. In this paper, we describe the XML Access Server (XAS), a general purpose XML 
based storage and retrieval system which ... 

18 Model-driven development of Web applications: the AutoWeb system 
Piero Fraternali, Paolo Paolini 

October 2000 ACM Transactions on Information Systems (TOIS), volume 18 issue 4 

Full text available: pdf(6.94 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper describes a methodology for the development of WWW applications and a tool 
environment specifically tailored for the methodology. The methodology and the 
development environment are based upon models and techniques already used in the 
hypermedia, information systems, and software engineering fields, adapted and blended in 
an original mix. The foundation of the proposal is the conceptual design of WWW
applications, using HDM-lite, a notation for the specification of structure, nav ... 

Keywords: HTML, WWW, application, development, intranet, modeling 



19 Link aggregation: Untangling compound documents on the web
Nadav Eiron, Kevin S. McCurley 

August 2003 Proceedings of the fourteenth ACM conference on Hypertext and 
hypermedia 

Full text available: pdf(192.59 KB) Additional Information: full citation, abstract, references, citings, index terms

Most text analysis is designed to deal with the concept of a "document", namely a cohesive 
presentation of thought on a unifying subject. By contrast, individual nodes on the World 
Wide Web tend to have a much smaller granularity than text documents. We claim that the 
notions of "document" and "web node" are not synonymous, and that authors often tend to 
deploy documents as collections of URLs, which we call "compound documents". In this 
paper we present new techniques for identifying and workin ... 

Keywords: composites, hypertext, information retrieval, semantic web, wasted space 



20 PEN: A hierarchical document editor
Todd Allen, Robert Nix, Alan Perlis 

June 1981 ACM SIGPLAN Notices, Proceedings of the ACM SIGPLAN SIGOA
symposium on Text manipulation, volume 16 issue 6

Full text available: pdf(834.17 KB) Additional Information: full citation, abstract, references, citings, index terms

Three terms in common usage in computerized text processing are text-editing,
word-processing, and computer controlled typesetting. This paper deals with a fourth term,
manuscript preparation, that has important intersections with the above three. A 
computerized manuscript preparation system is one that supports an author in the 
preparation of a manuscript. In what follows we deal with one such, the PEN sys ... 